{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/cookbooks/cleanlab_tlm_rag.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Trustworthy RAG with the Trustworthy Language Model\n",
    "\n",
    "This tutorial demonstrates how to use Cleanlab's [Trustworthy Language Model](https://cleanlab.ai/blog/trustworthy-language-model/) (TLM) in any RAG system, to score the trustworthiness of answers and **automatically catch incorrect/hallucinated responses in real-time**.\n",
    "\n",
    "Today's RAG and Agent applications often produce unreliable responses, because they depend on LLMs which are fundamentally unreliable. Cleanlab’s [Trustworthy Language Model](https://cleanlab.ai/blog/trustworthy-language-model/) scores the trustworthiness of every LLM response in real-time, using state-of-the-art uncertainty estimates for LLMs. Cleanlab works effectively **no matter your RAG architecture or retrieval and indexing processes**. \n",
    "\n",
    "To diagnose when RAG answers cannot be trusted, this tutorial demonstrates how to replace your LLM with Cleanlab's to [generate responses and score their trustworthiness](https://docs.llamaindex.ai/en/stable/examples/llm/cleanlab/). You can alternatively use Cleanlab only to score responses from your unmodified RAG system and run other real-time Evals, see [our Evaluation tutorial](https://docs.llamaindex.ai/en/stable/examples/evaluation/cleanlab/).\n",
    "\n",
    "## Setup\n",
    "\n",
    "RAG is all about connecting LLMs to data, to better inform their answers. This tutorial uses Nvidia's Q1 FY2024 earnings report as an example dataset.\n",
    "Use the following commands to download the data (earnings report) and store it in a directory named `data/`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget -nc 'https://cleanlab-public.s3.amazonaws.com/Datasets/NVIDIA_Financial_Results_Q1_FY2024.md'\n",
    "!mkdir -p ./data\n",
    "!mv NVIDIA_Financial_Results_Q1_FY2024.md data/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's install the required dependencies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-llms-cleanlab llama-index llama-index-embeddings-huggingface"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We then initialize Cleanlab's TLM. Here we initialize a CleanlabTLM object with default settings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llms.cleanlab import CleanlabTLM\n",
    "\n",
    "# set api key in env or in llm\n",
    "# get free API key from: https://cleanlab.ai/\n",
    "# import os\n",
    "# os.environ[\"CLEANLAB_API_KEY\"] = \"your api key\"\n",
    "\n",
    "llm = CleanlabTLM(api_key=\"your_api_key\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: If you encounter `ValidationError` during the above import, please upgrade your python version to >= 3.11\n",
    "\n",
    "You can achieve better results by playing with the TLM configurations outlined in this [advanced TLM tutorial](https://help.cleanlab.ai/tlm/tutorials/tlm_advanced/).\n",
    "\n",
    "For example, if your application requires OpenAI's GPT-4 model and restrict the output tokens to 256, you can configure it using the `options` argument:\n",
    "\n",
    "```python\n",
    "options = {\n",
    "    \"model\": \"gpt-4\",\n",
    "    \"max_tokens\": 256,\n",
    "}\n",
    "llm = CleanlabTLM(api_key=\"your_api_key\", options=options)\n",
    "```\n",
    "\n",
    "Let's start by asking the LLM a simple question."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NVIDIA's ticker symbol is NVDA.\n"
     ]
    }
   ],
   "source": [
    "response = llm.complete(\"What is NVIDIA's ticker symbol?\")\n",
    "print(response)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TLM not only provides a response but also includes a **trustworthiness score** indicating the confidence that this response is good/accurate. You can access this score from the response itself."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'trustworthiness_score': 0.9884868983475051}"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "response.additional_kwargs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Build a RAG pipeline with TLM\n",
    "\n",
    "Now let's integrate TLM into a RAG pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import Settings\n",
    "from llama_index.embeddings.huggingface import HuggingFaceEmbedding\n",
    "from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n",
    "\n",
    "Settings.llm = llm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Specify Embedding Model\n",
    "\n",
    "RAG uses an embedding model to match queries against document chunks to retrieve the most relevant data. Here we opt for a no-cost, local embedding model from Hugging Face. You can use any other embedding model by referring to this [LlamaIndex guide](https://docs.llamaindex.ai/en/stable/module_guides/models/embeddings/#embeddings)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Settings.embed_model = HuggingFaceEmbedding(\n",
    "    model_name=\"BAAI/bge-small-en-v1.5\"\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load Data and Create Index + Query Engine\n",
    "\n",
    "Let's create an index from the documents stored in the data directory. The system can index multiple files within the same folder, although for this tutorial, we'll use just one document.\n",
    "We stick with the default index from LlamaIndex for this tutorial."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "documents = SimpleDirectoryReader(\"data\").load_data()\n",
    "# Optional step since we're loading just one data file\n",
    "for doc in documents:\n",
    "    # file_path wouldn't be a useful metadata to add to LLM's context since our datasource contains just 1 file\n",
    "    doc.excluded_llm_metadata_keys.append(\"file_path\")\n",
    "index = VectorStoreIndex.from_documents(documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The generated index is used to power a query engine over the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = index.as_query_engine()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that TLM is agnostic to the index and the query engine used for RAG, and is compatible with any choices you make for these components of your system.\n",
    "\n",
    "In addition, you can just use TLM's trustworthiness score in an existing custom-built RAG pipeline (using any other LLM generator, streaming or not). <br>\n",
    "To achieve this, you'd need to fetch the prompt sent to LLM (including system instructions, retrieved context, user query, etc.) and the returned response. TLM requires both to predict trustworthiness.\n",
    "\n",
    "Details about this approach and example code are available [here](https://docs.llamaindex.ai/en/stable/examples/evaluation/cleanlab/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Extract Trustworthiness Score from LLM response\n",
    "\n",
    "As we saw earlier, Cleanlab's TLM also provides the `trustworthiness_score` in addition to the text, in its response to the prompt. \n",
    "\n",
    "To get this score out when TLM is used in a RAG pipeline, Llamaindex provides an [instrumentation](https://docs.llamaindex.ai/en/stable/module_guides/observability/instrumentation/#instrumentation) tool that allows us to observe the events running behind the scenes in RAG. <br> \n",
    "We can utilise this tooling to extract `trustworthiness_score` from LLM's response.\n",
    "\n",
    "Let's define a simple event handler that stores this score for every request sent to the LLM. You can refer to [Llamaindex's](https://docs.llamaindex.ai/en/stable/examples/instrumentation/basic_usage/) documentation for more details on instrumentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Dict, List, ClassVar\n",
    "from llama_index.core.instrumentation.events import BaseEvent\n",
    "from llama_index.core.instrumentation.event_handlers import BaseEventHandler\n",
    "from llama_index.core.instrumentation import get_dispatcher\n",
    "from llama_index.core.instrumentation.events.llm import LLMCompletionEndEvent\n",
    "\n",
    "\n",
    "class GetTrustworthinessScore(BaseEventHandler):\n",
    "    events: ClassVar[List[BaseEvent]] = []\n",
    "    trustworthiness_score: float = 0.0\n",
    "\n",
    "    @classmethod\n",
    "    def class_name(cls) -> str:\n",
    "        \"\"\"Class name.\"\"\"\n",
    "        return \"GetTrustworthinessScore\"\n",
    "\n",
    "    def handle(self, event: BaseEvent) -> Dict:\n",
    "        if isinstance(event, LLMCompletionEndEvent):\n",
    "            self.trustworthiness_score = event.response.additional_kwargs[\n",
    "                \"trustworthiness_score\"\n",
    "            ]\n",
    "            self.events.append(event)\n",
    "\n",
    "\n",
    "# Root dispatcher\n",
    "root_dispatcher = get_dispatcher()\n",
    "\n",
    "# Register event handler\n",
    "event_handler = GetTrustworthinessScore()\n",
    "root_dispatcher.add_event_handler(event_handler)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For each query, we can fetch this score from `event_handler.trustworthiness_score`. Let's see it in action."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Answering queries with our RAG system\n",
    "\n",
    "Let's try out our RAG pipeline based on TLM. Here we pose questions with differing levels of complexity."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional: Define `display_response` helper function\n",
    "\n",
    "\n",
    "# This method presents formatted responses from our TLM-based RAG pipeline. It parses the output to display both the text response itself and the corresponding trustworthiness score.\n",
    "def display_response(response):\n",
    "    response_str = response.response\n",
    "    trustworthiness_score = event_handler.trustworthiness_score\n",
    "    print(f\"Response: {response_str}\")\n",
    "    print(f\"Trustworthiness score: {round(trustworthiness_score, 2)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Easy Questions\n",
    "\n",
    "We first pose straightforward questions that can be directly answered by the provided data and can be easily located within a few lines of text."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Response: NVIDIA's total revenue in the first quarter of fiscal 2024 was $7.19 billion.\n",
      "Trustworthiness score: 1.0\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(\n",
    "    \"What was NVIDIA's total revenue in the first quarter of fiscal 2024?\"\n",
    ")\n",
    "display_response(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Response: The GAAP earnings per diluted share for the quarter (Q1 FY24) was $0.82.\n",
      "Trustworthiness score: 0.99\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(\n",
    "    \"What was the GAAP earnings per diluted share for the quarter?\"\n",
    ")\n",
    "display_response(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Response: Jensen Huang, NVIDIA's CEO, commented on the significant transitions the computer industry is undergoing, particularly in the areas of accelerated computing and generative AI.\n",
      "Trustworthiness score: 0.99\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(\n",
    "    \"What significant transitions did Jensen Huang, NVIDIA's CEO, comment on?\"\n",
    ")\n",
    "display_response(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TLM returns high trustworthiness scores for these responses, indicating high confidence they are accurate. After doing a quick fact-check (reviewing the original earnings report), we can confirm that TLM indeed accurately answered these questions. In case you're curious, here are relevant excerpts from the data context for these questions:\n",
    "\n",
    "> NVIDIA (NASDAQ: NVDA) today reported revenue for the first quarter ended April 30, 2023, of $7.19 billion, ...\n",
    "\n",
    "> GAAP earnings per diluted share for the quarter were $0.82, up 28% from a year ago and up 44% from the previous quarter.\n",
    "\n",
    "> Jensen Huang, founder and CEO of NVIDIA, commented on the significant transitions the computer industry is undergoing, particularly accelerated computing and generative AI, ..."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Questions without Available Context \n",
    "\n",
    "Now let's see how TLM responds to queries that *cannot* be answered using the provided data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Response: The report indicates that NVIDIA's professional visualization revenue declined by 53% year-over-year. While the specific factors contributing to this decline are not detailed in the provided information, several potential reasons can be inferred:\n",
      "\n",
      "1. **Market Conditions**: The overall market for professional visualization may have faced challenges, leading to reduced demand for NVIDIA's products in this segment.\n",
      "\n",
      "2. **Increased Competition**: The presence of competitors in the professional visualization space could have impacted NVIDIA's market share and revenue.\n",
      "\n",
      "3. **Economic Factors**: Broader economic conditions, such as inflation or reduced spending in industries that utilize professional visualization tools, may have contributed to the decline.\n",
      "\n",
      "4. **Transition to New Technologies**: The introduction of new technologies, such as the NVIDIA Omniverse™ Cloud, may have shifted focus away from traditional professional visualization products, affecting revenue.\n",
      "\n",
      "5. **Product Lifecycle**: If certain products were nearing the end of their lifecycle or if there were delays in new product launches, this could have impacted sales.\n",
      "\n",
      "Overall, while the report does not specify the exact reasons for the decline, these factors could be contributing elements based on industry trends and market dynamics.\n",
      "Trustworthiness score: 0.76\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(\n",
    "    \"What factors as per the report were responsible to the decline in NVIDIA's proviz revenue?\"\n",
    ")\n",
    "display_response(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The lower TLM trustworthiness score indicates a bit more uncertainty about the response, which aligns with the lack of information available. Let's try some more questions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Response: The report does not explicitly explain the reasons for the year-over-year decrease in NVIDIA's Gaming revenue. However, it does provide context regarding the overall performance of the gaming segment, noting that first-quarter revenue was $2.24 billion, which is down 38% from a year ago but up 22% from the previous quarter. This suggests that while there may have been a decline compared to the same period last year, there was a recovery compared to the previous quarter. Factors that could contribute to the year-over-year decline might include market conditions, competition, or changes in consumer demand, but these specifics are not detailed in the report.\n",
      "Trustworthiness score: 0.92\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(\n",
    "    \"How does the report explain why NVIDIA's Gaming revenue decreased year over year?\"\n",
    ")\n",
    "display_response(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Response: The context information provided does not include specific details about the industry average for dividend payouts. Therefore, I cannot directly compare NVIDIA's dividend payout for this quarter to the industry average. However, NVIDIA announced a quarterly cash dividend of $0.04 per share for shareholders of record on June 8, 2023. To assess how this compares to the industry average, one would need to look up the average dividend payout for similar companies in the technology or semiconductor industry.\n",
      "Trustworthiness score: 0.93\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(\n",
    "    \"How does NVIDIA's dividend payout for this quarter compare to the industry average?\",\n",
    ")\n",
    "display_response(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We observe that TLM demonstrates the ability to recognize the limitations of the available information. It refrains from generating speculative responses or hallucinations, thereby maintaining the reliability of the question-answering system. This behavior showcases an understanding of the boundaries of the context and prioritizes accuracy over conjecture. \n",
    "\n",
    "### Challenging Questions\n",
    "\n",
    "Let's see how our RAG system responds to harder questions, some of which may be misleading."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Response: NVIDIA's revenue for the first quarter of fiscal 2024 was $7.19 billion, and it was reported that this revenue was up 19% from the previous quarter. To find the revenue for the previous quarter, we can use the following calculation:\n",
      "\n",
      "Let \\( x \\) be the revenue for the previous quarter. \n",
      "\n",
      "The equation based on the 19% increase is:\n",
      "\\[ \n",
      "x + 0.19x = 7.19 \\text{ billion} \n",
      "\\]\n",
      "\\[ \n",
      "1.19x = 7.19 \\text{ billion} \n",
      "\\]\n",
      "\\[ \n",
      "x = \\frac{7.19 \\text{ billion}}{1.19} \\approx 6.04 \\text{ billion} \n",
      "\\]\n",
      "\n",
      "Now, to find the decrease in revenue from the previous quarter to this quarter:\n",
      "\\[ \n",
      "\\text{Decrease} = 7.19 \\text{ billion} - 6.04 \\text{ billion} \\approx 1.15 \\text{ billion} \n",
      "\\]\n",
      "\n",
      "Thus, NVIDIA's revenue decreased by approximately $1.15 billion this quarter compared to the last quarter.\n",
      "Trustworthiness score: 0.6\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(\n",
    "    \"How much did Nvidia's revenue decrease this quarter vs last quarter, in terms of $?\"\n",
    ")\n",
    "display_response(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Response: The report mentions the following companies: Microsoft and Dell. ServiceNow is also mentioned in the context, but it is not specified in the provided highlights. Therefore, the companies explicitly mentioned in the report are Microsoft and Dell.\n",
      "Trustworthiness score: 0.6\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(\n",
    "    \"This report focuses on Nvidia's Q1FY2024 financial results. There are mentions of other companies in the report like Microsoft, Dell, ServiceNow, etc. Can you name them all here?\",\n",
    ")\n",
    "display_response(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Response: In NVIDIA's Q1 FY2024 financial results, the following RTX GPU models were officially announced:\n",
      "\n",
      "1. **GeForce RTX 4060 family of GPUs**\n",
      "2. **GeForce RTX 4070 GPU**\n",
      "3. **Six new NVIDIA RTX GPUs for mobile and desktop workstations**\n",
      "\n",
      "This totals to **eight RTX GPU models** announced.\n",
      "Trustworthiness score: 0.74\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(\n",
    "    \"How many RTX GPU models, including all custom versions released by third-party manufacturers and all revisions across different series, were officially announced in NVIDIA's Q1 FY2024 financial results?\",\n",
    ")\n",
    "display_response(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Response: To calculate the projected annual revenue for NVIDIA's Data Center segment if it maintains its Q1 FY2024 quarter-over-quarter growth rate, we first need to determine the growth rate from Q4 FY2023 to Q1 FY2024.\n",
      "\n",
      "NVIDIA reported a record Data Center revenue of $4.28 billion for Q1 FY2024. The revenue for the previous quarter (Q4 FY2023) can be calculated as follows:\n",
      "\n",
      "Let \\( R \\) be the revenue for Q4 FY2023. The growth rate from Q4 FY2023 to Q1 FY2024 is given by:\n",
      "\n",
      "\\[\n",
      "\\text{Growth Rate} = \\frac{\\text{Q1 Revenue} - \\text{Q4 Revenue}}{\\text{Q4 Revenue}} = \\frac{4.28 - R}{R}\n",
      "\\]\n",
      "\n",
      "We know that the overall revenue for Q1 FY2024 is $7.19 billion, which is up 19% from the previous quarter. Therefore, we can express the revenue for Q4 FY2023 as:\n",
      "\n",
      "\\[\n",
      "\\text{Q1 FY2024 Revenue} = \\text{Q4 FY2023 Revenue} \\times 1.19\n",
      "\\]\n",
      "\n",
      "Substituting the known value:\n",
      "\n",
      "\\[\n",
      "7.19 = R \\times 1.19\n",
      "\\]\n",
      "\n",
      "Solving for \\( R \\):\n",
      "\n",
      "\\[\n",
      "R = \\frac{7.19}{1.19} \\approx 6.03 \\text{ billion}\n",
      "\\]\n",
      "\n",
      "Now, we can calculate the Data Center revenue for Q4 FY2023. Since we don't have the exact figure for the Data Center revenue in Q4 FY2023, we will assume that the Data Center revenue also grew by the same percentage as the overall revenue. \n",
      "\n",
      "Now, we can calculate the quarter-over-quarter growth rate for the Data Center segment:\n",
      "\n",
      "\\[\n",
      "\\text{Growth Rate} = \\frac{4.28 - R_D}{R_D}\n",
      "\\]\n",
      "\n",
      "Where \\( R_D \\) is the Data Center revenue for Q4 FY2023. However, we need to find \\( R_D \\) first. \n",
      "\n",
      "Assuming the Data Center revenue was a certain percentage of the total revenue in Q4 FY2023, we can estimate it. For simplicity, let's assume the Data Center revenue was around 50% of the total revenue in Q4 FY2023 (this is a rough estimate, as we don't have the exact figure).\n",
      "\n",
      "Thus, \\( R_D \\approx 0.5 \\times 6\n",
      "Trustworthiness score: 0.69\n"
     ]
    }
   ],
   "source": [
    "response = query_engine.query(\n",
    "    \"If NVIDIA's Data Center segment maintains its Q1 FY2024 quarter-over-quarter growth rate for the next four quarters, what would be its projected annual revenue?\",\n",
    ")\n",
    "display_response(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TLM automatically alerts us that these answers are unreliable, by the low trustworthiness score. RAG systems with TLM help you properly exercise caution when you see low trustworthiness scores. Here are the correct answers to the aforementioned questions:\n",
    "\n",
    "> NVIDIA's revenue increased by $1.14 billion this quarter compared to last quarter.\n",
    "\n",
    "> Google, Amazon Web Services, Microsoft, Oracle, ServiceNow, Medtronic, Dell Technologies.\n",
    "\n",
    "> There is not a specific total count of RTX GPUs mentioned.\n",
    "\n",
    "> Projected annual revenue if this growth rate is maintained for the next four quarters: approximately $26.34 billion.\n",
    "\n",
    "With TLM, you can easily increase trust in any RAG system! \n",
    "\n",
    "Read [TLM's performance benchmarks](https://cleanlab.ai/blog/trustworthy-language-model/) to learn about the effectiveness of the trustworthiness scoring. <br />\n",
    "Rather than replacing your LLM with Cleanlab's (as done in this tutorial), you can alternatively use Cleanlab only to detect incorrect responses from your existing unmodified RAG system; check out [our real-time Evaluation tutorial](https://docs.llamaindex.ai/en/stable/examples/evaluation/cleanlab/)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
